[MXNET-753] Fallback when using non-MKLDNN supported operators #12019
Conversation
@azai91 Very nice improvement! By the way, I see you only add the attr to conv. In general, we need to add it for all MKLDNN ops, right? @ZhennanQin @TaoLv for further comments.
@azai91 Good improvement before the subgraph feature is ready. I'm not familiar with the engine part; can this change cover all execution scenarios, i.e. any combination of NaiveEngine, ThreadedEngine, symbolic, Gluon, hybridized Gluon, dynamic memory allocation, and static memory allocation?
@@ -351,6 +351,13 @@ static inline void InvalidateOutputs(const std::vector<NDArray> &arrs,
  }
}

static inline std::vector<NDArray> InvalidateInputs(const std::vector<NDArray> &arrs) {
The name is confusing; it suggests that the inputs are mutated.
src/executor/attach_op_execs_pass.cc
Outdated
@@ -226,6 +228,11 @@ class FComputeExExecutor : public OpExecutor {
    op_ctx.run_ctx = rctx;
#if MXNET_USE_MKLDNN == 1
    InvalidateOutputs(out_array, req);
    const auto is_mkldnn = Op::GetAttr<bool>("TIsMKLDNN");
    if (!is_mkldnn.get(attrs_.op, false)) {
      fcompute_(attrs_, op_ctx, InvalidateInputs(in_array), req, out_array);
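The attribute-based dispatch this hunk adds can be sketched in isolation. This is a minimal, self-contained illustration, not MXNet's actual NNVM attribute API: an operator whose `TIsMKLDNN` flag is absent or false is routed to the fallback path.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Toy stand-in for the per-op attribute map queried via Op::GetAttr.
struct OpAttrs {
  std::unordered_map<std::string, bool> flags;
  // Mirrors the `is_mkldnn.get(attrs_.op, false)` lookup: return the
  // flag if registered, otherwise the supplied default.
  bool Get(const std::string& key, bool dflt) const {
    auto it = flags.find(key);
    return it == flags.end() ? dflt : it->second;
  }
};

// Pick the execution path the way the executor does: flagged ops take the
// MKLDNN path, everything else falls back to plain FCompute.
std::string Dispatch(const OpAttrs& attrs) {
  return attrs.Get("TIsMKLDNN", false) ? "mkldnn" : "fallback";
}
```

The default of `false` is what makes the scheme safe for operators that never registered the attribute: they silently get the fallback path.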
Is it thread-safe to modify the inputs?
nvm
Force-pushed from ec1d179 to 3bc9d6d
tests/python/mkl/test_mkldnn.py
Outdated
data = mx.symbol.Variable('data')
conv = mx.sym.Convolution(data=data, kernel=(5, 5), pad=(1, 1), stride=(1, 1), num_filter=8, name="conv", no_bias=True)
mlp = mx.symbol.Custom(name='custom', data=conv, op_type='custom')
nit: rename mlp to custom or custom_op
@@ -299,7 +299,11 @@ class UnaryOp : public OpBase {
      }
      break;
    case kWriteInplace:
      // cannot check if ptrs are the same for MKLDNN because we may have
      // created copies of input when reordering. WriteInPlace will still write to the original array
#if MXNET_USE_MKLDNN != 1
change to #if MXNET_USE_MKLDNN == 0
src/executor/attach_op_execs_pass.cc
Outdated
@@ -40,6 +40,11 @@ const OperatorProperty* OpPropGetOpProperty(const NodeAttrs& attrs);

namespace exec {

class MKLDNNOpExecutor : public OpExecutor {
 protected:
  std::vector<NDArray> in_array_fallback;
Why do we need an extra copy rather than just editing the in_array directly?
There is a race condition if we attempt to reorder the read_var in place: other operators may be reading from it at the same time, since inputs are expected to be read-only constants.
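The copy-instead-of-mutate design being defended here can be sketched as follows. `Array`, the `default_layout` field, and the body of `CreateDefaultInputs` are illustrative stand-ins, not MXNet's NDArray or MKLDNN reorder machinery; the point is only that shared read-only inputs are never touched.

```cpp
#include <cassert>
#include <vector>

// Toy array: a payload plus a flag standing in for "is in default layout".
struct Array {
  std::vector<float> data;
  bool default_layout = false;
};

// Produce default-layout copies of the inputs. The originals are taken by
// const reference and never mutated, so concurrent readers are safe.
std::vector<Array> CreateDefaultInputs(const std::vector<Array>& arrs) {
  std::vector<Array> out;
  out.reserve(arrs.size());
  for (const auto& a : arrs) {
    Array copy = a;              // private copy, safe to transform
    copy.default_layout = true;  // stand-in for an MKLDNN->default reorder
    out.push_back(copy);
  }
  return out;
}
```

An in-place reorder would be a write to memory other threads may be reading concurrently; copying trades a little memory for freedom from that race.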
If that's the case, why do we even bother to create a subclass with this extra data member?
@@ -182,6 +182,7 @@ The following activation functions are supported: | |||
}) | |||
.set_attr<FCompute>("FCompute<cpu>", ActivationCompute<cpu>) | |||
#if MXNET_USE_MKLDNN == 1 | |||
.set_attr<bool>("TIsMKLDNN", true) |
Is there a more generic way to add this instead of doing it for every operator? What if we later add new operators? Should we document this somewhere?
It only needs to be added to MKLDNN operators. This fix is a temporary solution while we get the subgraph feature implemented. We weighed the pros and cons of releasing a stable MKLDNN build with this hack now versus waiting another month for subgraph to be introduced (possibly with its own bugs), and decided to use this short-term solution.
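One more generic option, sketched with a toy registry (`OpReg` and `RegisterMKLDNNOp` are hypothetical helpers, not MXNet's actual NNVM registration API): funnel every MKLDNN operator registration through a single helper that sets the flag, so a newly added operator cannot forget it.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Toy operator registry entry supporting chained attribute setting,
// loosely modeled on NNVM's fluent set_attr style.
struct OpReg {
  std::unordered_map<std::string, bool> attrs;
  OpReg& set_attr(const std::string& key, bool v) {
    attrs[key] = v;
    return *this;
  }
};

// Single chokepoint for MKLDNN operators: the flag is applied here once,
// instead of being repeated in every operator's registration block.
OpReg& RegisterMKLDNNOp(OpReg& reg) {
  return reg.set_attr("TIsMKLDNN", true);
}
```

With a chokepoint like this, forgetting the attribute on a new MKLDNN op becomes a structural impossibility rather than a code-review catch.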
Please mark this TODO and create a JIRA ticket to remove this later after MKLDNN support is released.
added TODO to opexecutor
src/executor/attach_op_execs_pass.cc
Outdated
 public:
  void Run(RunContext rctx, bool is_gpu) override {
    op_ctx.run_ctx = rctx;
#if MXNET_USE_MKLDNN == 1
    InvalidateOutputs(out_array, req);
    in_array_fallback = CreateDefaultInputs(in_array);
Can we just define the array here instead of creating an extra data member in the class?
src/executor/attach_op_execs_pass.cc
Outdated
@@ -153,12 +158,15 @@ class StatefulComputeExecutor : public StorageFallbackOpExecutor {


// stateful compute_ex executor
-class StatefulComputeExExecutor : public OpExecutor {
+class StatefulComputeExExecutor : public MKLDNNOpExecutor {
Making this class derive from MKLDNNOpExecutor when the base is only valid with MXNET_USE_MKLDNN == 1 violates the is-a relationship of class inheritance.
src/executor/exec_pass.h
Outdated
@@ -86,6 +86,9 @@ class OpExecutor {
  virtual OpStatePtr state() const {
    return OpStatePtr();
  }

 protected:
  std::vector<NDArray> in_array_fallback;
Do we really need this as a class data member? Can we just declare it as a local variable where it is created? I don't see it used anywhere else in the flow.
src/executor/attach_op_execs_pass.cc
Outdated
@@ -159,6 +159,9 @@ class StatefulComputeExExecutor : public OpExecutor {
    op_ctx.run_ctx = rctx;
#if MXNET_USE_MKLDNN == 1
    InvalidateOutputs(out_array, req);
    in_array_fallback = CreateDefaultInputs(in_array);
Can we just declare the std::vector in_array_fallback here?
we need the fallback arrays to stay in memory or else we segfault.
I cannot see why that is the case, but creating a data member is not a good solution to this. I am not very comfortable introducing a new data member to this class without much justification. If this PR is only a temporary workaround I am okay to approve, but please create a JIRA ticket to remove this hack later.
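The lifetime argument behind the data member can be sketched like this, assuming (as the author argues) that the engine may execute the kernel asynchronously after Run() returns. `Executor`, `pending`, and the int arrays are illustrative stand-ins, not MXNet's engine types: if `in_array_fallback` were a local in Run(), the pointer captured by the deferred "kernel" would dangle; as a member it lives as long as the executor.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

struct Executor {
  std::vector<int> in_array_fallback;  // member: outlives Run()
  std::function<int()> pending;        // stand-in for the async engine queue

  void Run(const std::vector<int>& inputs) {
    in_array_fallback = inputs;               // copy kept alive by *this
    const int* p = in_array_fallback.data();  // raw pointer handed to kernel
    std::size_t n = in_array_fallback.size();
    // The "kernel" runs later, after Run() has returned; it only stays
    // valid because the pointed-to storage is owned by the member.
    pending = [p, n] {
      int sum = 0;
      for (std::size_t i = 0; i < n; ++i) sum += p[i];
      return sum;
    };
  }
};
```

A cleaner fix than a member on the base class might be to capture the copies by value inside the closure pushed to the engine, but as the thread notes, this whole path is a stopgap until subgraph lands.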
Force-pushed from 59f18a5 to c356e6c
src/executor/exec_pass.h
Outdated
@@ -86,6 +86,9 @@ class OpExecutor {
  virtual OpStatePtr state() const {
    return OpStatePtr();
  }

 protected:
  std::vector<NDArray> in_array_fallback;
Please make sure to add TODO with JIRA number to remove this hack.
@@ -356,6 +356,17 @@ static inline void InvalidateOutputs(const std::vector<NDArray> &arrs,
  }
}

static inline std::vector<NDArray> CreateDefaultInputs(const std::vector<NDArray> &arrs) {
Also add TODO to remove this unnecessary function when final solution is implemented.
LGTM (given that this is a temporary workaround before the subgraph optimization is integrated and the related JIRA ticket has been assigned).
@@ -356,6 +356,18 @@ static inline void InvalidateOutputs(const std::vector<NDArray> &arrs,
  }
}

// TODO(alexzai): (MXNET-856) Remove helper function after subgraph feature added
static inline std::vector<NDArray> CreateDefaultInputs(const std::vector<NDArray> &arrs) {
Can we pass a pointer to a vector as an argument instead of returning a vector?
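The reviewer's suggestion could look roughly like this; the element type and the body are illustrative stand-ins (the real helper copies NDArrays and reorders MKLDNN layouts), but the output-pointer signature is the shape being asked for.

```cpp
#include <cassert>
#include <vector>

// Output-parameter variant of the helper: the caller owns the destination
// vector, so nothing is returned by value. In practice C++11 move semantics
// make the return-by-value version cheap too; the pointer style mainly
// matches the codebase's existing convention for output parameters.
static void CreateDefaultInputs(const std::vector<int>& arrs,
                                std::vector<int>* out) {
  out->clear();
  out->reserve(arrs.size());
  for (int a : arrs) out->push_back(a);  // stand-in for the reorder/copy
}
```

This also lets the caller reuse one destination vector across iterations instead of constructing a fresh one per call.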
Force-pushed from fe7f954 to da69adb
Force-pushed from bea0609 to f3e55e9
…e#12019)
* add fallback test
* wait to read throws error
* add TIsMKLDNN attr
* invalidate inputs if fcomputeex unsupported
* keep ptr to newly created default arrays
* add flag to all mkldnn operators
* update method name to CreateDefaultInputs
* remove dup attrs
* create new instance var to store copy
* only reorder if mkldnn
* add mkldnn flag to batch norm
* do not check input / output ptr for mkldnn as copied is made
* fix lint
* update macro
* update custom update name
* add todo for fallback
* fix lint
* rename opexecutor name
* add fallback to opexecutor class
* fix lint
* add todos
* create fallback arrays in place
* revert in place diff
* create copy of arrays for fallback
* empty array
Description
- Convert all MKLDNN special-format arrays to the default format when they are consumed by a non-MKLDNN operator.
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments